Oxidising Agent for Modified Nucleotides

ABSTRACT

This invention relates to the use of metal (VI) oxo complexes to catalyse the selective oxidation of 5hmC residues in polynucleotides to 5fC residues. This may be useful in the identification of modified cytosine residues in a population of polynucleotides comprising a sample nucleotide sequence. A first portion of the population is oxidised with a metal (VI) oxo complex and then the first portion and a second portion of said population are both treated with bisulfite. The residues in the first and second portions that correspond to a cytosine residue in the sample nucleotide sequence are identified following treatment and the identities of these residues are used to determine the modification of the cytosine residue in the sample nucleotide sequence. Methods, reagents and kits are provided.

This invention relates to reagents, and in particular oxidising agents, for use in the detection of modified cytosine residues and the analysis and/or sequencing of nucleic acids that contain modified cytosine residues.

5-methylcytosine (5mC) is a well-studied epigenetic DNA mark that plays important roles in gene silencing and genome stability, and is found enriched at CpG dinucleotides (1). In metazoa, 5mC can be oxidised to 5-hydroxymethylcytosine (5hmC) by the ten-eleven translocation (TET) family of enzymes (2, 3). The overall levels of 5hmC are roughly 10-fold lower than those of 5mC and vary between tissues (4). Relatively high quantities of 5hmC (~0.4% of all cytosines) are present in embryonic stem (ES) cells, where 5hmC has been suggested to have a role in the establishment and/or maintenance of pluripotency (2, 3, 5-9). 5hmC has been proposed as an intermediate in active DNA demethylation, for example by deamination or via further oxidation of 5hmC to 5-formylcytosine (5fC) and 5-carboxycytosine (5caC) by the TET enzymes, followed by base excision repair involving thymine-DNA glycosylase (TDG) or failure to maintain the mark during replication (10). However, 5hmC may also constitute an epigenetic mark per se.

It is possible to detect and quantify the level of 5hmC present in total genomic DNA by analytical methods that include thin layer chromatography and tandem liquid chromatography-mass spectrometry (2, 11, 12). Mapping the genomic locations of 5hmC has thus far been achieved by enrichment methods that have employed chemistry or antibodies for 5hmC-specific precipitation of DNA fragments that are then sequenced (6-8, 13-15). These pull-down approaches have relatively poor resolution (10s to 100s of nucleotides) and give only relative quantitative information that is likely to be subject to distributional biasing during the enrichment. Quantifiable single nucleotide sequencing of 5mC has been performed using bisulfite sequencing (BS-Seq), which exploits the bisulfite-mediated deamination of cytosine to uracil for which the corresponding transformation of 5mC is much slower (16). However, it has been recognized that both 5mC and 5hmC are very slow to deaminate in the bisulfite reaction and so these two bases cannot be discriminated (17, 18). Two relatively new and elegant single molecule methods have shown promise in detecting 5mC and 5hmC at single nucleotide resolution. Single molecule real-time sequencing (SMRT) has been shown to detect derivatised 5hmC in genomic DNA (19). However, enrichment of DNA fragments containing 5hmC is required, which leads to loss of quantitative information (19). 5mC can be detected, albeit with lower accuracy, by SMRT (19). Furthermore, SMRT has a relatively high rate of sequencing errors (20), the peak calling of modifications is imprecise (19) and the platform has not yet sequenced a whole genome. Protein and solid-state nanopores can resolve 5mC from 5hmC and have the potential to sequence unamplified DNA molecules with further development (21, 22).

The quantitative mapping of 5hmC and 5mC in genomic DNA at single-nucleotide resolution has been reported using “oxidative bisulfite” sequencing (oxBS-Seq) methods (23). These methods involve the specific oxidation of 5hmC to 5fC using potassium perruthenate (KRuO₄).

The present inventors have recognised that metal (VI) oxo complexes, such as ruthenate, may be useful in catalysing the selective oxidation of 5hmC residues in polynucleotides to 5fC residues. This may be useful, for example, in methods of “oxidative bisulfite” analysis and sequencing.

An aspect of the invention provides a method of identifying a modified cytosine residue in a sample nucleotide sequence comprising;

-   (i) providing a population of polynucleotides which comprise the sample nucleotide sequence,
-   (ii) treating a first portion of said population with a metal (VI) oxo complex,
-   (iii) treating said first portion of said population and a second portion of said population with bisulfite, and
-   (iv) identifying the residue in the first and second nucleotide sequences which corresponds to a cytosine residue in the sample nucleotide sequence.

In some embodiments, the residue may be identified by sequencing. For example, a method of identifying a modified cytosine residue in a sample nucleotide sequence may comprise;

-   (i) providing a population of polynucleotides which comprise the sample nucleotide sequence,
-   (ii) treating a first portion of said population with a metal (VI) oxo complex,
-   (iii) treating said first portion of said population and a second portion of said population with bisulfite,
-   (iv) sequencing the polynucleotides in the first and second portions of the population following steps ii) and iii) to produce first and second nucleotide sequences, respectively and;
-   (v) identifying the residue in the first and second nucleotide sequences which corresponds to a cytosine residue in the sample nucleotide sequence.

Suitable sequencing methods are well-known in the art and described in more detail below.

The residues identified in the first and second nucleotide sequences may be indicative of a modified cytosine at the corresponding position in the sample nucleotide sequence i.e. the presence of a modified cytosine at a position in the sample nucleotide sequence may be determined from the identity of the residues which are located at the same position in the first and second nucleotide sequences.

For example, cytosine residues may be present at one or more positions in the sample nucleic acid sequence. The residues at the corresponding positions in the first and second nucleotide sequences may be identified. The presence of a modification, for example a 5-substitution, such as 5-methyl or 5-hydroxymethyl substitution, on a cytosine in the sample nucleotide sequence may be determined from the combination of residues which are identified in the first and second nucleotide sequences respectively (i.e. C and C, U and U, C and U, or U and C) at the position of the cytosine in the sample nucleotide sequence. The cytosine modifications which are indicated by different combinations are shown in Table 1.

Treatment with the metal (VI) oxo complex oxidises 5-hydroxymethylcytosine (5hmC) residues in the first portion of polynucleotides into 5-formylcytosine (5fC) residues. 5fC residues in the first portion of polynucleotides are subsequently converted into uracil by the bisulfite treatment of step (iii). In some embodiments, treatment with the metal (VI) oxo complex may further oxidise some or all of the 5-formylcytosine (5fC) residues into 5-carboxycytosine (5caC) residues. 5caC residues in the first portion of polynucleotides are also converted into uracil by the bisulfite treatment of step (iii).

A metal (VI) oxo complex comprises a metal atom (M) in the +6 oxidation state coordinated with one or more oxygen atoms.

The metal (VI) atom may be tri- or tetra-coordinated i.e. the M⁶⁺ atom may be coordinated with three oxygen atoms (e.g. rhenium oxide Re(VI)O₃) or four oxygen atoms in an oxyanion (e.g. ruthenate Ru(VI)O₄²⁻).

In preferred embodiments, the metal (VI) atom is coordinated with four oxygen atoms (MO₄²⁻) in an oxyanion, preferably with tetrahedral geometry. Suitable metal (VI) oxo complexes include manganate (Mn(VI)O₄²⁻), ferrate (Fe(VI)O₄²⁻), osmate (Os(VI)O₄²⁻), ruthenate (Ru(VI)O₄²⁻), or molybdate (Mo(VI)O₄²⁻).

In some preferred embodiments, the metal (VI) oxo complex is ruthenate (Ru(VI)O₄²⁻) or manganate (Mn(VI)O₄²⁻), most preferably ruthenate (Ru(VI)O₄²⁻).

Metal (VI) oxo complexes may be prepared by any suitable technique and various methods are available in the art.

For example, a metal (VI) oxo complex (M(VI)O₄²⁻) suitable for use in the oxidation of 5hmC may be produced by reduction of the corresponding metal (VII) oxo complex or metal (VIII) oxo complex (e.g. M(VII)O₄⁻ or M(VIII)O₄). Any suitable reduction protocol may be employed, for example heating or treatment with hydroxide or peroxide ions. Suitable metal (VI) oxo complexes (M(VI)O₄²⁻) may also be produced by oxidation of metal oxides and oxo complexes (e.g. MO₂).
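The oxidation states named above follow from simple charge bookkeeping. By way of illustration only, the formal oxidation state of the metal in a mononuclear species MOₙ may be computed as sketched below. The function name is hypothetical, and treating every oxo ligand as carrying a formal 2− charge is an assumption of this example rather than part of the methods described herein.

```python
def metal_oxidation_state(n_oxygen: int, charge: int) -> int:
    """Formal oxidation state of M in a mononuclear oxo species MO_n with the given net charge.

    Assumption: each oxygen is an oxo ligand counted as -2, so
    state(M) + n_oxygen * (-2) = charge.
    """
    return charge + 2 * n_oxygen


# Generic species named in the text: (number of oxygens, net charge)
species = {
    "MO4(2-), e.g. ruthenate or manganate": (4, -2),
    "MO4(-), e.g. perruthenate or permanganate": (4, -1),
    "MO4, e.g. ruthenium tetroxide": (4, 0),
}

for name, (n_oxygen, charge) in species.items():
    print(f"{name}: M is +{metal_oxidation_state(n_oxygen, charge)}")
```

On this bookkeeping, reduction of an M(VII) or M(VIII) tetraoxo species to the M(VI) oxyanion corresponds to a gain of one or two electrons, respectively.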

Ruthenate (Ru(VI)O₄²⁻) may be conveniently prepared by reducing perruthenate (Ru(VII)O₄⁻) or ruthenium tetroxide (Ru(VIII)O₄), for example using iodide, hydroxide (OH⁻), peroxide (O₂²⁻) or by heating. Ruthenate may also be conveniently prepared by oxidation of Ru complexes, for example using KMnO₄ or HClO (hypochlorite). Ruthenate (Ru(VI)O₄²⁻) may also be prepared from ruthenium trioxide (RuO₃), for example using aqueous base and persulfate.

Manganate (Mn(VI)O₄²⁻) may be conveniently prepared by reducing permanganate (Mn(VII)O₄⁻) using hydroxide (OH⁻), peroxide (O₂²⁻) or by heating.

Osmate ([Os(VI)O₄(OH)₂]²⁻) may be prepared by reducing Os(VIII)O₄ using hydroxide (Os(VIII)O₄ + 2OH⁻ → [Os(VI)O₄(OH)₂]²⁻).

Ferrate (Fe(VI)O₄²⁻) may be prepared by heating iron filings with potassium nitrate; heating iron(III) hydroxide with hypochlorite in alkaline conditions; or by alkaline hypochlorite oxidation of ferric nitrate.

Molybdate (MoO₄²⁻) may be prepared by dissolving molybdenum trioxide in alkali (MoO₃ + 2NaOH + H₂O → Na₂MoO₄·2H₂O).

Examples of suitable methods for the preparation of ruthenate (VI) oxo complexes and manganate (VI) oxo complexes are described in more detail below.

Metal (VI) oxo complexes may also be obtained from commercial sources (e.g. Alfa Aesar, MA USA; Sigma Aldrich, USA).

Treatment with the metal (VI) oxo complex selectively oxidises 5-hydroxymethylcytosine residues in the first portion of the population of polynucleotides into 5-formylcytosine residues. Substantially no other functionality in the polynucleotide is oxidised by the metal (VI) oxo complex. The treatment therefore does not result in the reaction of any thymine or 5-methylcytosine residues, where such are present.

The first portion of polynucleotides may be treated with the metal (VI) oxo complex at a sufficient concentration to selectively oxidise 5-hydroxymethylcytosine residues in the polynucleotides. Suitable concentrations of the metal (VI) oxo complex for the selective oxidation of 5-hydroxymethylcytosine may be readily determined using standard techniques. For example, 0.1 mM to 10 mM and most preferably about 1 mM ruthenate may be employed.

Typically, metal (VI) oxo complexes for use in the methods described herein are stored in concentrated stock solutions, which may for example have a concentration which is 5 fold, 10 fold, 100 fold, 150 fold or 500 fold greater than the concentration used to treat the first portion of polynucleotides. A typical stock solution may be 100 mM to 150 mM.
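The relationship between stock and working concentrations is ordinary dilution arithmetic (C₁V₁ = C₂V₂). The following sketch is illustrative only: the function names and the 50 µL reaction volume are hypothetical and are not taken from the methods described herein, while the 100 mM stock and 1 mM working concentrations are the example values given above.

```python
def fold_concentration(stock_mm: float, working_mm: float) -> float:
    """How many fold more concentrated the stock is than the working solution."""
    if stock_mm <= 0 or working_mm <= 0:
        raise ValueError("concentrations must be positive")
    return stock_mm / working_mm


def stock_volume_ul(stock_mm: float, working_mm: float, final_volume_ul: float) -> float:
    """Volume of stock (in microlitres) needed for a given final volume, from C1*V1 = C2*V2."""
    if final_volume_ul <= 0:
        raise ValueError("final volume must be positive")
    if working_mm > stock_mm:
        raise ValueError("working concentration cannot exceed stock concentration")
    return working_mm * final_volume_ul / stock_mm


# Example values from the text: 100 mM stock, about 1 mM working concentration.
# The 50 uL reaction volume is a hypothetical value chosen for illustration.
print(fold_concentration(100.0, 1.0))      # fold excess of stock over working solution
print(stock_volume_ul(100.0, 1.0, 50.0))   # microlitres of stock per 50 uL reaction
```

A 100 mM to 150 mM stock used at about 1 mM thus corresponds to the 100 fold and 150 fold figures mentioned above.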

Oxidation of hmC residues by the metal (VI) oxo complex does not degrade or damage the polynucleotides in the first portion to an extent which prevents subsequent amplification and/or sequencing of the first portion i.e. the first portion includes sufficient intact or undamaged polynucleotides following treatment with the metal (VI) oxo complex to allow amplification and/or sequencing and is not totally degraded.

Preferably, the metal (VI) oxo complex causes no degradation or damage, or minimal degradation or damage to the polynucleotides in the first portion or does not cause substantial degradation or damage to the polynucleotides.

Polynucleotide damage or degradation may include phosphodiester bond cleavage; 5′ dephosphorylation and/or depurination. Treatment with the metal (VI) oxo complex may not cause substantial phosphodiester bond cleavage; 5′ dephosphorylation; depyrimidination and/or depurination of the polynucleotides in the first portion and, preferably, causes minimal or no phosphodiester bond cleavage; 5′ dephosphorylation and/or depurination.

Treatment with the metal (VI) oxo complex may result in the formation of some corresponding 5-carboxycytosine product as well as 5-formylcytosine. The formation of this product does not negatively impact on the methods of identification described herein. Under the bisulfite reaction conditions that are used to convert 5-formylcytosine to uracil, 5-carboxycytosine is observed to convert to uracil also. It is understood that a reference to 5-formylcytosine that is obtained by oxidation of 5-hydroxymethylcytosine may be a reference to a product also comprising 5-carboxycytosine that is also obtained by that oxidation.

Advantageously, the treatment conditions may also preserve the polynucleotides in a denatured state i.e. denaturing conditions may be employed. Suitable conditions cause denaturation of the polynucleotides without causing damage or degradation. For example, the polynucleotides may be treated with the metal (VI) oxo complex under alkali conditions, such as 50 mM to 500 mM NaOH or 50 mM to 500 mM KOH.

Following treatment with the metal (VI) oxo complex, the polynucleotides in the first portion may be purified. Purification may be performed using any convenient nucleic acid purification technique. Suitable nucleic acid purification techniques include spin-column chromatography.

The polynucleotides may be subjected to further, repeat treatment with the metal (VI) oxo complex. Such steps are undertaken to maximise the conversion of 5-hydroxymethylcytosine to 5-formylcytosine. This may be necessary where a polynucleotide has sufficient secondary structure that it is capable of re-annealing. Any annealed portions of the polynucleotide may limit or prevent access of the metal (VI) oxo complex to that portion of the structure, which has the effect of protecting 5-hydroxymethylcytosine from oxidation.

In some embodiments, the first portion of the population of polynucleotides may for example be subjected to multiple cycles of treatment with the metal (VI) oxo complex followed by purification.

For example, one, two, three or more than three cycles may be performed.

Following treatment with the metal (VI) oxo complex and optional purification, the first portion of the population is then treated with bisulfite. A second portion of the population which has not been oxidised is also treated with bisulfite.

Bisulfite treatment converts both cytosine and 5-formylcytosine residues in a polynucleotide into uracil. As noted above, where any 5-carboxycytosine is present (as a product of the oxidation step), this 5-carboxycytosine is converted into uracil in the bisulfite treatment. Without wishing to be bound by theory, it is believed that the reaction of the 5-formylcytosine proceeds via loss of the formyl group to yield cytosine, followed by a subsequent deamination to give uracil. The 5-carboxycytosine is believed to yield the uracil through a sequence of decarboxylation and deamination steps. Bisulfite treatment may be performed under conditions that convert both cytosine and 5-formylcytosine or 5-carboxycytosine residues in a polynucleotide as described herein into uracil.
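The combined effect of the oxidation and bisulfite steps on each residue may be illustrated with a small idealised model. The sketch below assumes complete and perfectly selective conversion at every step, which real reactions only approximate; the dictionary and function names are hypothetical. The readout of 5mC and 5hmC as cytosine reflects the slow deamination of these residues under bisulfite conditions noted earlier.

```python
# Idealised conversion rules (assumption: 100% efficient, perfectly selective).
OXIDATION = {"5hmC": "5fC"}          # metal (VI) oxo complex: 5hmC -> 5fC, nothing else reacts
BISULFITE_READOUT = {
    "C": "U",      # cytosine is deaminated to uracil
    "5fC": "U",    # 5-formylcytosine is converted to uracil
    "5caC": "U",   # 5-carboxycytosine is converted to uracil
    "5mC": "C",    # very slow to deaminate, so read as cytosine
    "5hmC": "C",   # very slow to deaminate, so read as cytosine
}


def oxidise(residues):
    """Apply the oxidation step to a list of residue labels."""
    return [OXIDATION.get(r, r) for r in residues]


def bisulfite(residues):
    """Apply bisulfite treatment; residues not listed (A, G, T) pass through unchanged."""
    return [BISULFITE_READOUT.get(r, r) for r in residues]


sample = ["C", "5mC", "5hmC", "5fC", "A", "G"]

first_portion = bisulfite(oxidise(sample))   # oxidised, then bisulfite treated
second_portion = bisulfite(sample)           # bisulfite treated only

print(first_portion)
print(second_portion)
```

In this model the two portions differ only at the 5hmC position, which is the difference exploited by the identification methods described herein.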

Polynucleotides may be treated with bisulfite by incubation with bisulfite ions (HSO₃⁻). The use of bisulfite ions (HSO₃⁻) to convert unmethylated cytosines in nucleic acids into uracil is standard in the art and suitable reagents and conditions are well known to the skilled person (52-55). Numerous suitable protocols and reagents are also commercially available (for example, EpiTect™, Qiagen NL; EZ DNA Methylation™, Zymo Research Corp CA; CpGenome Turbo Bisulfite Modification Kit, Millipore).

A feature of OxBS methods is the conversion of unmethylated cytosine (which may be generated in situ from 5-formylcytosine or 5-carboxycytosine) to uracil. This reaction is typically achieved through the use of bisulfite. However, in general aspects of the invention, any reagent or reaction conditions may be used to effect the conversion of cytosine to uracil. Such reagents and conditions are selected such that little or no 5-methylcytosine reacts, and more specifically such that little or no 5-methylcytosine reacts to form uracil. The reagent, or optionally a further reagent, may also effect the conversion of 5-formylcytosine or 5-carboxycytosine to cytosine or uracil.

Following the incubation with bisulfite ions, the portions of polynucleotides may be immobilised, washed, desulfonated, eluted and/or otherwise treated as required.

Methods using metal (VI) oxo complexes as described herein may be useful in identifying and/or distinguishing cytosine (C), 5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC) in a sample nucleotide sequence. For example, the methods may be useful in distinguishing one residue from the group consisting of cytosine (C), 5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC) from the other residues in the group.

Preferably, modified cytosine residues, such as 5-hydroxymethylcytosine, in the first portion of said population are not labelled, for example with substituent groups, such as glucose, before the oxidisation or reduction of step ii).

The identification of a residue at a position in one or both of the first and second nucleotide sequences as cytosine is indicative that the cytosine residue at that position in the sample nucleotide sequence is 5-methylcytosine or 5-hydroxymethylcytosine.

5-hydroxymethylcytosine (5hmC) may be identified in the sample nucleotide sequence. A uracil residue at a position in the first nucleotide sequence which corresponds to a cytosine in the sample nucleotide sequence and a cytosine at the same position in the second nucleotide sequence are indicative that the cytosine residue at that position in the sample nucleotide sequence is 5-hydroxymethylcytosine (5hmC).

A method of identifying a 5-hydroxymethylcytosine (5hmC) residue in a sample nucleotide sequence or distinguishing 5-hydroxymethylcytosine from cytosine (C), 5-methylcytosine, and 5-formylcytosine (5fC) in a sample nucleotide sequence may comprise;

-   (i) providing a population of polynucleotides which comprise the sample nucleotide sequence,
-   (ii) treating a first portion of said population with a metal (VI) oxo complex,
-   (iii) further treating said first portion of said population and a second portion of said population with bisulfite, and;
-   (iv) identifying the residue in the first and second portions of said population at the same position as a cytosine residue in the sample nucleotide sequence,
-   wherein the presence of a uracil residue in the first portion and a cytosine in the second portion is indicative that the cytosine residue in the sample nucleotide sequence is 5-hydroxymethylcytosine.

For example, a method of identifying a 5-hydroxymethylcytosine (5hmC) residue in a sample nucleotide sequence or distinguishing 5-hydroxymethylcytosine from cytosine (C), 5-methylcytosine, and 5-formylcytosine (5fC) in a sample nucleotide sequence may comprise;

-   (i) providing a population of polynucleotides which comprise the sample nucleotide sequence,
-   (ii) treating a first portion of said population with a metal (VI) oxo complex,
-   (iii) further treating said first portion of said population and a second portion of said population with bisulfite,
-   (iv) sequencing the polynucleotides in the first and second portions of the population following steps ii) and iii) to produce first and second nucleotide sequences, respectively and;
-   (v) identifying the residue in the first and second nucleotide sequences which corresponds to a cytosine residue in the sample nucleotide sequence,
-   wherein the presence of a uracil residue in the first nucleotide sequence and a cytosine in the second nucleotide sequence is indicative that the cytosine residue in the sample nucleotide sequence is 5-hydroxymethylcytosine.

5-methylcytosine (5mC) may be identified in a sample nucleotide sequence. Cytosine at a position in both the first and second nucleotide sequences that corresponds to a cytosine residue in the sample nucleotide sequence is indicative that the cytosine residue in the sample nucleotide sequence is 5-methylcytosine (5mC).

A method of identifying 5-methylcytosine in a sample nucleotide sequence or distinguishing 5-methylcytosine from cytosine (C), 5-hydroxymethylcytosine (5hmC) and 5-formylcytosine (5fC) in a sample nucleotide sequence may comprise;

-   (i) providing a population of polynucleotides which comprise the sample nucleotide sequence,
-   (ii) treating a first portion of said population with a metal (VI) oxo complex,
-   (iii) further treating the first portion of said population and a second portion of said population with bisulfite, and
-   (iv) identifying the residue in the first and second portions of said population which is at the same position as a cytosine residue in the sample nucleotide sequence,
-   wherein the presence of a cytosine in both the first and second portions is indicative that the cytosine residue in the sample nucleotide sequence is 5-methylcytosine (5mC).

For example, a method of identifying 5-methylcytosine in a sample nucleotide sequence or distinguishing 5-methylcytosine from cytosine (C), 5-hydroxymethylcytosine (5hmC) and 5-formylcytosine (5fC) in a sample nucleotide sequence may comprise;

-   (i) providing a population of polynucleotides which comprise the sample nucleotide sequence,
-   (ii) treating a first portion of said population with a metal (VI) oxo complex,
-   (iii) further treating the first portion of said population and a second portion of said population with bisulfite,
-   (iv) sequencing the polynucleotides in the first and second portions of the population following steps ii) and iii) to produce first and second nucleotide sequences, respectively and;
-   (v) identifying the residue in the first and second nucleotide sequences which corresponds to a cytosine residue in the sample nucleotide sequence,
-   wherein the presence of a cytosine in both the first and second nucleotide sequences is indicative that the cytosine residue in the sample nucleotide sequence is 5-methylcytosine (5mC).

Uracil residues at a position in both the first and second nucleotide sequences which correspond to a cytosine in the sample nucleotide sequence are indicative that the cytosine residue in the sample nucleotide sequence is not 5-methylcytosine or 5-hydroxymethylcytosine i.e. the cytosine residue is unmodified cytosine or 5-formylcytosine.

A summary of the cytosine modifications at a position in the sample nucleotide sequence which are indicated by specific combinations of cytosine and uracil at the position in the first and second nucleotide sequences is shown in Table 1.
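The combinations described in the preceding paragraphs may, by way of illustration only, be expressed as a simple lookup. The mapping below is limited to the three combinations set out in the text above; the names used are hypothetical, and the remaining combination (cytosine in the first sequence and uracil in the second) is deliberately left unassigned rather than guessed.

```python
# Keys are (residue in first sequence, residue in second sequence), where the first
# portion is oxidised and bisulfite treated and the second is bisulfite treated only.
CALLS = {
    ("U", "C"): "5hmC",       # oxidation made the residue convertible: 5-hydroxymethylcytosine
    ("C", "C"): "5mC",        # resistant in both portions: 5-methylcytosine
    ("U", "U"): "C or 5fC",   # converted in both portions: unmodified cytosine or 5-formylcytosine
}


def call_modification(first_residue: str, second_residue: str):
    """Return the indicated modification, or None if the combination is not covered here."""
    return CALLS.get((first_residue, second_residue))


for pair in [("U", "C"), ("C", "C"), ("U", "U"), ("C", "U")]:
    print(pair, "->", call_modification(*pair))
```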

The first and second portions of the polynucleotide population may be treated with bisulfite and/or sequenced simultaneously or sequentially.

In some embodiments, treatment of the second portion may not be required to identify or distinguish a modified cytosine residue in the sample nucleotide sequence. For example, Table 1 shows that oxidation and bisulfite treatment of the first portion of the polynucleotide population is sufficient to identify 5-methylcytosine in the sample nucleotide sequence. A method of identifying 5-methylcytosine in a sample nucleotide sequence or distinguishing 5-methylcytosine from cytosine (C), 5-hydroxymethylcytosine (5hmC) and 5-formylcytosine (5fC) in a sample nucleotide sequence may comprise;

-   (i) providing a population of polynucleotides which comprise the sample nucleotide sequence,
-   (ii) treating a first portion of said population with a metal (VI) oxo complex,
-   (iii) further treating the first portion of polynucleotides with bisulfite,
-   (iv) identifying the residue in the treated first portion of polynucleotides which is at the same position as a cytosine residue in the sample nucleotide sequence (i.e. the residue in the first portion which corresponds to the cytosine residue in the sample nucleotide sequence),
-   wherein the presence of a cytosine at the position in the treated first portion is indicative that the cytosine residue in the sample nucleotide sequence is 5-methylcytosine (5mC).

For example, a method of identifying 5-methylcytosine in a sample nucleotide sequence or distinguishing 5-methylcytosine from cytosine (C), 5-hydroxymethylcytosine (5hmC) and 5-formylcytosine (5fC) in a sample nucleotide sequence may comprise;

-   (i) providing a population of polynucleotides which comprise the sample nucleotide sequence,
-   (ii) treating a first portion of said population with a metal (VI) oxo complex,
-   (iii) further treating the first portion of polynucleotides with bisulfite,
-   (iv) sequencing the polynucleotides in the population following steps ii) and iii) to produce a treated nucleotide sequence, and;
-   (v) identifying the residue in the treated nucleotide sequence which corresponds to a cytosine residue in the sample nucleotide sequence,
-   wherein the presence of a cytosine in the treated nucleotide sequence is indicative that the cytosine residue in the sample nucleotide sequence is 5-methylcytosine (5mC).

In some embodiments, methods according to any one of the aspects and embodiments set out above may comprise sequencing a first portion of polynucleotides which has been oxidised and bisulfite treated; and a second portion of polynucleotides which has been bisulfite treated.

For example, a method may comprise;

-   (i) providing a population of polynucleotides which comprise the sample nucleotide sequence,
-   (ii) providing first and second portions of the population,
-   (iii) treating the first portion of said population with a metal (VI) oxo complex,
-   (iv) treating the first and second portions of said population with bisulfite,
-   (v) sequencing the polynucleotides in the first and second portions of the population following steps ii), iii) and iv) to produce first and second nucleotide sequences, respectively and;
-   (vi) identifying the residue in the first and second nucleotide sequences which corresponds to a cytosine residue in the sample nucleotide sequence.

The sample nucleotide sequence may be already known or it may be determined. The sample nucleotide sequence is the sequence of untreated polynucleotides in the population i.e. polynucleotides which have not been oxidised, reduced or bisulfite treated. In the sample nucleotide sequence, modified cytosines are not distinguished from cytosine. 5-Methylcytosine, 5-formylcytosine and 5-hydroxymethylcytosine are all indicated to be or identified as cytosine residues in the sample nucleotide sequence. For example, methods according to any one of the aspects and embodiments set out above may further comprise;

-   providing a third portion of the population of polynucleotides comprising the sample nucleotide sequence; and,
-   sequencing the polynucleotides in the third portion to produce the sample nucleotide sequence.

The sequence of the polynucleotides in the third portion may be determined by any appropriate sequencing technique.

The positions of one or more cytosine residues in the sample nucleotide sequence may be determined. This may be done by standard sequence analysis. Since modified cytosines are not distinguished from cytosine, cytosine residues in the sample nucleotide sequence may be cytosine, 5-methylcytosine, 5-formylcytosine or 5-hydroxymethylcytosine.

The first and second nucleotide sequences (i.e. the nucleotide sequences of the first and second portions) may be compared to the sample nucleotide sequence. For example, the residues at positions in the first and second sequences corresponding to the one or more cytosine residues in the sample nucleotide sequence may be identified.

In some embodiments, the residue in the first and second portions which is at the same position as a cytosine residue in the sample nucleotide sequence may be identified.

The modification of a cytosine residue in the sample nucleotide sequence may be determined from the identity of the nucleotides at the corresponding positions in the first and second nucleotide sequences (i.e. the nucleotides in the same position as the cytosine residue in the nucleotide sequences of the first and second portions).
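The comparison of the first and second nucleotide sequences with the sample nucleotide sequence may be sketched as a position-by-position scan. The example below is illustrative only and rests on several assumptions that are not part of the methods described herein: the three sequences are already aligned, of equal length and on the same strand; converted residues are written as U; and the function name, the rule table and the example sequences are hypothetical.

```python
def call_cytosines(sample_seq: str, first_seq: str, second_seq: str) -> dict:
    """Map each cytosine position in the sample sequence to the indicated modification.

    first_seq:  sequence of the oxidised and bisulfite-treated portion
    second_seq: sequence of the portion treated with bisulfite only
    Assumption: all three sequences are pre-aligned and of equal length.
    """
    if not (len(sample_seq) == len(first_seq) == len(second_seq)):
        raise ValueError("sequences must be aligned and of equal length")

    # Only the combinations described in the text are assigned.
    rules = {
        ("U", "C"): "5hmC",
        ("C", "C"): "5mC",
        ("U", "U"): "C or 5fC",
    }

    calls = {}
    for position, base in enumerate(sample_seq):
        if base != "C":
            continue  # only positions that are cytosine in the sample sequence are examined
        pair = (first_seq[position], second_seq[position])
        calls[position] = rules.get(pair, "unassigned")
    return calls


# Hypothetical example: cytosines at positions 1, 3 and 5 (0-based) of the sample sequence.
sample = "ACGCTC"
first = "AUGCTU"    # oxidised + bisulfite
second = "ACGCTU"   # bisulfite only

print(call_cytosines(sample, first, second))
```

Positions that are not cytosine in the sample nucleotide sequence are skipped, consistent with the description above that only residues corresponding to a cytosine in the sample nucleotide sequence are identified.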

The polynucleotides in the population all contain the same sample nucleotide sequence i.e. the sample nucleotide sequence is identical in all of the polynucleotides in the population.

The effect of different treatments on cytosine residues within the sample nucleotide sequence can then be determined, as described herein.
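Where many polynucleotides covering the same position are sequenced in each portion, the per-residue logic described herein suggests a simple way of estimating proportions at that position. The sketch below is an assumption-laden illustration and is not a procedure set out in the text: it supposes that residues read as cytosine in the first portion are 5mC, that those read as cytosine in the second portion are 5mC or 5hmC, and that conversion is complete. The function name, argument names and read counts are hypothetical.

```python
def modification_levels(c_reads_first: int, total_first: int,
                        c_reads_second: int, total_second: int) -> dict:
    """Estimate fractions at one position from counts of reads showing cytosine.

    Assumptions (not stated in the source text): complete conversion, and
      fraction C in first portion  (oxidised + bisulfite) ~ 5mC
      fraction C in second portion (bisulfite only)       ~ 5mC + 5hmC
    """
    if total_first <= 0 or total_second <= 0:
        raise ValueError("read totals must be positive")
    frac_first = c_reads_first / total_first
    frac_second = c_reads_second / total_second
    return {
        "5mC": frac_first,
        "5hmC": frac_second - frac_first,   # may be slightly negative through sampling noise
        "C or 5fC": 1.0 - frac_second,
    }


# Hypothetical counts: 40 of 100 reads show C in the first portion, 55 of 100 in the second.
levels = modification_levels(40, 100, 55, 100)
for name, value in levels.items():
    print(f"{name}: {value:.2f}")
```

Because the 5hmC estimate is a difference between two sampled fractions, small negative values can arise at low read depth and would need to be handled by whatever statistical treatment is chosen.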

The sample nucleotide sequence may be a genomic sequence. For example, the sequence may comprise all or part of the sequence of a gene, including exons, introns or upstream or downstream regulatory elements, or the sequence may comprise genomic sequence that is not associated with a gene. In some embodiments, the sample nucleotide sequence may comprise one or more CpG islands.

Suitable polynucleotides include DNA, preferably genomic DNA, and/or RNA, such as genomic RNA (e.g. mammalian, plant or viral genomic RNA), mRNA, tRNA, rRNA and non-coding RNA.

The polynucleotides comprising the sample nucleotide sequence may be obtained or isolated from a sample of cells, for example, mammalian cells, preferably human cells.

Suitable samples include isolated cells and tissue samples, such as biopsies.

Modified cytosine residues including 5hmC and 5fC have been detected in a range of cell types including embryonic stem cells (ESCs) and neural cells (2, 3, 11, 45, 46).

Suitable cells include somatic and germ-line cells.

Suitable cells may be at any stage of development, including fully or partially differentiated cells or non-differentiated or pluripotent cells, including stem cells, such as adult or somatic stem cells, foetal stem cells or embryonic stem cells.

Suitable cells also include induced pluripotent stem cells (iPSCs), which may be derived from any type of somatic cell in accordance with standard techniques.

For example, polynucleotides comprising the sample nucleotide sequence may be obtained or isolated from neural cells, including neurons and glial cells, contractile muscle cells, smooth muscle cells, liver cells, hormone synthesising cells, sebaceous cells, pancreatic islet cells, adrenal cortex cells, fibroblasts, keratinocytes, endothelial and urothelial cells, osteocytes, and chondrocytes.

Suitable cells include disease-associated cells, for example cancer cells, such as carcinoma, sarcoma, lymphoma, blastoma or germ line tumour cells.

Suitable cells include cells with the genotype of a genetic disorder such as Huntington's disease, cystic fibrosis, sickle cell disease, phenylketonuria, Down syndrome or Marfan syndrome.

Methods of extracting and isolating genomic DNA and RNA from samples ofcells are well-known in the art. For example, genomic DNA or RNA may beisolated using any convenient isolation technique, such asphenol/chloroform extraction and alcohol precipitation, caesium chloridedensity gradient centrifugation, solid-phase anion-exchangechromatography and silica gel-based techniques.

In some embodiments, whole genomic DNA and/or RNA isolated from cells may be used directly as a population of polynucleotides as described herein after isolation. In other embodiments, the isolated genomic DNA and/or RNA may be subjected to further preparation steps.

The genomic DNA and/or RNA may be fragmented, for example by sonication, shearing or endonuclease digestion, to produce genomic DNA fragments. A fraction of the genomic DNA and/or RNA may be used as described herein. Suitable fractions of genomic DNA and/or RNA may be based on size or other criteria. In some embodiments, a fraction of genomic DNA and/or RNA fragments which is enriched for CpG islands (CGIs) may be used as described herein.

The genomic DNA and/or RNA may be denatured, for example by heating or treatment with a denaturing agent. Suitable methods for the denaturation of genomic DNA and RNA are well known in the art.

In methods according to any one of the aspects and embodiments set out above, the genomic DNA and/or RNA may be adapted for sequencing and/or other analysis before oxidation and bisulfite treatment, or bisulfite treatment alone. The nature of the adaptations depends on the sequencing or analysis method that is to be employed. For example, for some sequencing methods, primers may be ligated to the free ends of the genomic DNA and/or RNA fragments following fragmentation. Suitable primers may contain 5mC to prevent the primer sequences from altering during oxidation and bisulfite treatment, or bisulfite treatment alone, as described herein. In other embodiments, the genomic DNA and/or RNA may be adapted for sequencing after oxidation and/or bisulfite treatment, as described herein.

Following fractionation, denaturation, adaptation and/or other preparation steps, the genomic DNA and/or RNA may be purified by any convenient technique.

Following preparation, the population of polynucleotides may be provided in a suitable form for further treatment as described herein. For example, the population of polynucleotides may be in aqueous solution in the absence of buffers before treatment as described herein.

Polynucleotides for use as described herein may be single or double-stranded.

The population of polynucleotides may be divided into two, three, four or more separate portions, each of which contains polynucleotides comprising the sample nucleotide sequence. These portions may be independently treated and sequenced as described herein.

Preferably, the portions of polynucleotides are not treated to add labels or substituent groups, such as glucose, to 5-hydroxymethylcytosine residues in the sample nucleotide sequence before oxidation and/or reduction.

In methods according to any one of the aspects and embodiments set out above, the first and second portions of polynucleotides from the population may be amplified following treatment as described above. This may facilitate further manipulation and/or sequencing. Sequence alterations in the first and second portions of polynucleotides are preserved following the amplification. Suitable polynucleotide amplification techniques are well known in the art and include PCR. The presence of a uracil (U) residue at a position in the first and/or second portions of polynucleotide may be indicated or identified by the presence of a thymine (T) residue at that position in the corresponding amplified polynucleotide. Optionally, the portions of polynucleotides may be purified before amplification.
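By way of illustration only, the read-out expected after treatment and amplification can be sketched in Python. The sketch assumes the outcomes set out in Table 1, with uracil read as thymine after amplification. The one-letter codes "m" (5mC) and "h" (5hmC) and the eight-residue test sequence are hypothetical and are introduced solely for this example.

```python
# Base read after amplification for each treatment, following Table 1
# (U is read as T after amplification). The codes "m" (5mC) and "h" (5hmC)
# are hypothetical one-letter codes used only in this sketch.
BISULFITE_ONLY = {"C": "T", "m": "C", "h": "C"}
OXIDATION_THEN_BISULFITE = {"C": "T", "m": "C", "h": "T"}


def read_out(sequence, conversion):
    """Return the sequence read after treatment and amplification."""
    return "".join(conversion.get(base, base) for base in sequence)


# Hypothetical sequence containing one 5mC, one unmodified C and one 5hmC.
sample = "AmGTCGhA"
second_portion = read_out(sample, BISULFITE_ONLY)            # bisulfite alone
first_portion = read_out(sample, OXIDATION_THEN_BISULFITE)   # oxidised, then bisulfite
print(second_portion)
print(first_portion)
```

The two reads differ only at the position of the 5hmC residue, which is the position at which the oxidised and non-oxidised portions are compared.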

As described above, the residue in the first and second portions of polynucleotides at the same position as the cytosine residue in the sample nucleotide sequence may be identified by sequencing the polynucleotides in the first and second portions of the population following steps ii) and iii) to produce first and second nucleotide sequences, respectively. Polynucleotides may be adapted after oxidation, reduction and/or bisulfite treatment to be compatible with a sequencing technique or platform. The nature of the adaptation will depend on the sequencing technique or platform. For example, for Solexa-Illumina sequencing, the treated polynucleotides may be fragmented, for example by sonication or restriction endonuclease treatment, the free ends of the polynucleotides repaired as required, and primers ligated onto the ends.

Polynucleotides may be sequenced using any convenient low or high throughput sequencing technique or platform, including Sanger sequencing (38), Solexa-Illumina sequencing (39), Ligation-based sequencing (SOLiD™) (40), pyrosequencing (41), Single Molecule Real Time sequencing (SMRT™) (42, 43) and semiconductor array sequencing (Ion Torrent™) (44).

Suitable protocols, reagents and apparatus for polynucleotide sequencing are well known in the art and are available commercially.

The residue in the first and second portions of polynucleotides at the same position as the cytosine residue in the sample nucleotide sequence may be identified by hybridisation-based techniques.

In some embodiments, the residue may be identified using specific oligonucleotide probes which hybridise to the polynucleotides with cytosine at the same position as the cytosine residue in the sample nucleotide sequence but do not hybridise to polynucleotides with any other residue at this position; or which hybridise to the polynucleotides with a uracil (or thymine) at the same position as the cytosine residue in the sample nucleotide sequence but do not hybridise to polynucleotides with any other residue at the position. For example, the residue in the first and second portions may be identified by;

-   a) contacting polynucleotides of the first and second portions of the population with a detection oligonucleotide,
    wherein the detection oligonucleotide specifically hybridises to only one of i) polynucleotides of said portions which have a cytosine at the same position as the cytosine residue in the sample nucleotide sequence and ii) polynucleotides of said portions which have a uracil at the same position as the cytosine residue in the sample nucleotide sequence and;
-   b) determining the hybridisation of the detection oligonucleotide to the polynucleotides of the first and second portions.

The presence or amount of hybridisation of the detection oligonucleotide to the polynucleotides is indicative of the identity of the residue in the first and second portions of said population which corresponds to the cytosine residue in the sample nucleotide sequence.

In some embodiments, two or more detection oligonucleotides may be employed. For example, a method may comprise contacting polynucleotides of the first and second portions of the population with;

-   i) a first detection oligonucleotide which specifically hybridises to polynucleotides of said portions which have a cytosine at the same position as the cytosine residue in the sample nucleotide sequence, and;
-   ii) a second detection oligonucleotide which specifically hybridises to polynucleotides of said portions which have a uracil at the same position as the cytosine residue in the sample nucleotide sequence, and;
-   b) determining the hybridisation of the first and second detection oligonucleotides to the polynucleotides of the first and second portions,
    wherein the presence or amount of hybridisation of the first and second detection oligonucleotide is indicative of the identity of the residue in the first and second portions of said population at the same position as the cytosine residue in the sample nucleotide sequence.

For example, hybridisation of the first but not the second detection oligonucleotide is indicative that the residue is cytosine and hybridisation of the second but not the first detection oligonucleotide is indicative that the residue is uracil.

Suitable protocols, reagents and apparatus for the identification of nucleotide residues by hybridisation are well known in the art and are available commercially. Suitable techniques include dynamic allele specific hybridisation (26), molecular beacons (27), array-based techniques (28) and Taqman™ (36).

In some embodiments, the residue in the first and second portions may be identified using oligonucleotide probes which hybridise to the polynucleotides of the first and second portions. Following hybridisation, the presence or absence of a base mismatch between the probe and the polynucleotide at the position in the polynucleotides which corresponds to the cytosine residue in the sample nucleotide sequence may be determined. For example, the residue in the first and second portions may be identified by;

-   a) contacting polynucleotides of the first and second portions of the population with a detection oligonucleotide which specifically hybridises to the polynucleotides, and;
-   b) determining the presence or absence of a base mismatch between the detection oligonucleotide and the polynucleotides at the position of the cytosine residue in the sample nucleotide sequence.

Base mismatches may be determined by any suitable technique. Suitable protocols, reagents and apparatus for the identification of base mismatches are well-known in the art and include Flap endonuclease and Invader assays (32), primer extension assays (28), oligonucleotide ligation assays (28, 37), denaturing high-performance liquid chromatography (DHPLC) and mismatch-specific endonuclease cleavage (47, 49, 50, 51).

In some embodiments, primer extension techniques may be employed to identify the residue in the first and second portions of polynucleotides which is at the same position as the cytosine residue in the sample nucleotide sequence. For example, a method may comprise;

-   a) contacting polynucleotides of the first and second portions of the population with a detection oligonucleotide which hybridises to the polynucleotides immediately 3′ of the position of the cytosine residue in the sample nucleotide sequence,
-   b) extending the hybridised detection oligonucleotide to incorporate a nucleotide which is complementary to the residue in the polynucleotides which is at the same position as the cytosine residue in the sample nucleotide sequence, and
-   c) determining the identity of the incorporated nucleotide.
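As a minimal sketch of the final step, the template residue may be inferred from the incorporated nucleotide by base pairing: an incorporated G implies cytosine in the template and an incorporated A implies uracil (or thymine, if the polynucleotide has been amplified). The function and dictionary names are hypothetical and are used for illustration only.

```python
# Template residue implied by the nucleotide incorporated during extension
# (Watson-Crick pairing: G pairs with C, A pairs with U or T).
TEMPLATE_FOR_INCORPORATED = {"G": "C", "A": "U"}


def template_residue(incorporated):
    """Infer the residue at the interrogated position from the extended base."""
    return TEMPLATE_FOR_INCORPORATED.get(incorporated.upper(), "undetermined")


print(template_residue("G"))
print(template_residue("A"))
```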

Suitable protocols, reagents and apparatus for use in primer extension assays are well-known in the art and include Infinium HD (Illumina; 33) and arrayed primer extension (APEX) (28, 34, 35).

In some embodiments, specific amplification techniques may be employed to identify the residue in the first and second portions of polynucleotides which corresponds to a cytosine residue in the sample nucleotide sequence. The polynucleotides of the first and second portions may be amplified using amplification primers which produce an amplification product only when a cytosine residue is present at the same position in the polynucleotides as the cytosine residue in the sample sequence and not when another residue is present at this position; or which produce an amplification product only when a uracil residue is present at the same position in the polynucleotides as the cytosine residue in the sample sequence and not when another residue is present at this position. For example, a method may comprise;

-   a) subjecting the first and second portions of the population to amplification with one or more amplification primers which either;
    i) amplify polynucleotides in said portions which have a cytosine at the position corresponding to the cytosine residue in the sample sequence to produce an amplification product and do not amplify polynucleotides in said portions with other bases at this position; or
    ii) amplify polynucleotides in said portions which have a uracil at the position corresponding to the cytosine residue in the sample sequence to produce an amplification product and do not amplify polynucleotides in said portions with other bases at this position, and;
-   b) determining the presence of an amplification product produced by said amplification primers.

Suitable amplification techniques are well known in the art and include PCR based techniques such as amplification refractory mutation system (ARMS)-PCR (29), allele-specific PCR (30), allele specific amplification (ASA; 31) and adaptor-ligation-mediated ASA (48).

In any of the methods described above, the detection oligonucleotide(s) and/or amplification primers may be immobilised, for example on beads or an array.

Other suitable techniques for identifying the residue in the first and second portions of polynucleotides which is at the same position as the cytosine residue in the sample nucleotide sequence may be employed. For example, a method may comprise;

-   a) fragmenting the polynucleotides of the first and second portions to produce fragments of the first and second portions,
-   b) determining the size and/or mass of the fragments, and
-   c) determining from the size and/or mass of the fragments of the first and second portions the identity of the residue in the first and second nucleotide sequences.

The polynucleotides may be fragmented by any suitable method, including base-specific endonuclease techniques (47, 49, 50, 51). Suitable methods for determining the size and/or mass of the polynucleotide fragments are also well-known in the art and include MALDI-MS.

Suitable protocols, reagents and apparatus for use in fragmentation and detection techniques are well-known in the art and include iPLEX SNP genotyping (Sequenom) (24, 25).

In other embodiments, the residue in the first and second portions of polynucleotides at the same position as the cytosine residue in the sample nucleotide sequence may be identified using one or both of i) a specific binding member which binds to polynucleotides of said portions which have a cytosine at the same position as the cytosine residue in the sample nucleotide sequence and does not bind to polynucleotides of said portions which do not have a cytosine at this position; and ii) a specific binding member which binds to polynucleotides of said portions which have a uracil at the same position as the cytosine residue in the sample nucleotide sequence and does not bind to polynucleotides of said portions which do not have a uracil at this position.

The specific binding member may be contacted with the first and second portions of polynucleotides and the binding of the member to the polynucleotides determined. The presence of binding may be indicative of the identity of the residue in the first and second portions of polynucleotides which is located at the same position as the cytosine residue in the sample nucleotide sequence.

Suitable specific binding members are well-known in the art and include antibody molecules, such as whole antibodies and fragments, and aptamers.

The residues at positions in the first and second nucleotide sequences which correspond to cytosine in the sample nucleotide sequence may be identified.

The modification of a cytosine residue at a position in the sample nucleotide sequence may be determined from the identity of the residues at the corresponding positions in the first and second nucleotide sequences, as described above.
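The determination may be sketched, purely for illustration, as a lookup over the outcomes of Table 1, where the first nucleotide sequence is from the portion that was oxidised and then treated with bisulfite and the second is from the portion treated with bisulfite alone. The function name and the "undetermined" label are assumptions of this sketch.

```python
def call_cytosine(first_base, second_base):
    """Call the modification state of a cytosine position.

    first_base:  residue read in the oxidised, bisulfite-treated portion.
    second_base: residue read in the portion treated with bisulfite alone.
    U and T are treated as equivalent (U is read as T after amplification).
    """
    first = "T" if first_base.upper() == "U" else first_base.upper()
    second = "T" if second_base.upper() == "U" else second_base.upper()
    calls = {
        ("C", "C"): "5mC",   # protected from conversion in both portions
        ("T", "C"): "5hmC",  # converted only after oxidation
        ("T", "T"): "C",     # unmodified cytosine, converted in both portions
    }
    # Any other combination is not predicted by Table 1.
    return calls.get((first, second), "undetermined")


for pair in [("C", "C"), ("U", "C"), ("T", "T"), ("C", "T")]:
    print(pair, call_cytosine(*pair))
```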

The extent or amount of cytosine modification in the sample nucleotide sequence may be determined. For example, the proportion or amount of 5-hydroxymethylcytosine and/or 5-methylcytosine in the sample nucleotide sequence compared to unmodified cytosine may be determined.
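One way such proportions might be estimated from read counts at a single position is sketched here. It relies only on the outcomes of Table 1: cytosine is read after bisulfite treatment alone for both 5mC and 5hmC, and after oxidation and bisulfite treatment for 5mC only. The function name, the argument names and the read counts are hypothetical, and clipping negative differences to zero is a simplifying assumption.

```python
def modification_levels(bs_c_reads, bs_total, oxbs_c_reads, oxbs_total):
    """Estimate fractions of 5mC, 5hmC and unmodified C at one position.

    bs_*   : counts from the portion treated with bisulfite alone.
    oxbs_* : counts from the portion oxidised and then treated with bisulfite.
    """
    bs_fraction = bs_c_reads / bs_total        # 5mC + 5hmC read as C
    oxbs_fraction = oxbs_c_reads / oxbs_total  # only 5mC read as C
    return {
        "5mC": oxbs_fraction,
        # Sampling noise can make the difference slightly negative; clip at 0.
        "5hmC": max(bs_fraction - oxbs_fraction, 0.0),
        "C": 1.0 - bs_fraction,
    }


# Hypothetical counts: 60 of 100 reads are C after bisulfite alone,
# 45 of 100 reads are C after oxidation and bisulfite.
levels = modification_levels(60, 100, 45, 100)
print(levels)
```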

In methods according to any one of the aspects and embodiments set out above, polynucleotides, for example the population of polynucleotides or 1, 2 or all 3 of the first, second and third portions of the population, may be immobilised on a solid support.

Similarly, detection oligonucleotides, amplification primers and specific binding members may be immobilised on a solid support.

A solid support is an insoluble, non-gelatinous body which presents a surface on which the polynucleotides can be immobilised.

Examples of suitable supports include glass slides, microwells, membranes, or microbeads. The support may be in particulate or solid form, including for example a plate, a test tube, bead, a ball, filter, fabric, polymer or a membrane. Polynucleotides may, for example, be fixed to an inert polymer, a 96-well plate, other device, apparatus or material which is used in nucleic acid sequencing or other investigative context. The immobilisation of polynucleotides to the surface of solid supports is well-known in the art. In some embodiments, the solid support itself may be immobilised. For example, microbeads may be immobilised on a second solid surface.

In some preferred embodiments, the first, second and/or third polynucleotides may be immobilised on magnetic beads. This may facilitate purification of the polynucleotides between steps.

In methods according to any one of the aspects and embodiments set out above, the first, second and/or third portions of the population of polynucleotides may be amplified before sequencing or other analysis. Preferably, the portions of polynucleotide are amplified following the treatment with bisulfite.

Suitable methods for the amplification of polynucleotides are well known in the art.

Following amplification, the amplified portions of the population of polynucleotides may be sequenced or otherwise analysed. In some embodiments, specific amplification primers may be employed, such that the presence or absence of amplified portions is in itself indicative of the identity of the residue in the portion of polynucleotides which is located at the same position as the cytosine residue in the sample nucleotide sequence.

Nucleotide sequences may be compared and the residues at positions in the first, second and/or third nucleotide sequences which correspond to cytosine in the sample nucleotide sequence may be identified, using computer-based sequence analysis.

Nucleotide sequences, such as CpG islands, with cytosine modification greater than a threshold value may be identified. For example, one or more nucleotide sequences in which greater than 1%, greater than 2%, greater than 3%, greater than 4% or greater than 5% of cytosines are hydroxymethylated may be identified.
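A minimal sketch of such threshold-based identification follows. The sequence names and hydroxymethylation fractions are invented for illustration, and the function name is hypothetical.

```python
def sequences_above_threshold(hmc_fraction_by_sequence, threshold=0.01):
    """Return names of sequences whose 5hmC fraction exceeds the threshold."""
    return sorted(
        name
        for name, fraction in hmc_fraction_by_sequence.items()
        if fraction > threshold
    )


# Hypothetical fractions of hydroxymethylated cytosines per CpG island.
example = {"CGI_1": 0.004, "CGI_2": 0.027, "CGI_3": 0.051}
print(sequences_above_threshold(example, threshold=0.01))
print(sequences_above_threshold(example, threshold=0.05))
```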

Computer-based sequence analysis may be performed using any convenient computer system and software.

Another aspect of the invention provides a kit for use in a method of identifying a modified cytosine residue according to any one of the aspects and embodiments set out above, in particular a 5-methylcytosine (5mC) or 5-hydroxymethylcytosine (5hmC), comprising;

-   (i) a metal (VI) oxo complex; and,
-   (ii) a bisulfite reagent.

Suitable metal (VI) oxo complexes and bisulfite reagents are described above. For example, the metal (VI) oxo complex may be manganate (MnO₄²⁻), ferrate (FeO₄²⁻), osmate (OsO₄²⁻), ruthenate (RuO₄²⁻), or molybdate oxyanion (MoO₄²⁻).

In some preferred embodiments, the metal (VI) oxo complex is ruthenate (RuO₄²⁻) or manganate (MnO₄²⁻), preferably, ruthenate (RuO₄²⁻).

The metal (VI) oxo complex may be supplied in the form of a salt, for example an alkali metal salt, such as lithium, sodium or potassium salt. Preferably the metal (VI) oxo complex is supplied in the form of a potassium salt.

In some preferred embodiments, the kit may comprise dipotassium ruthenate (K₂RuO₄) or dipotassium manganate (K₂MnO₄).

The metal (VI) oxo complex or salt thereof may be supplied in the form of a solution, preferably an aqueous solution.

The solution may be a concentrated solution for dilution to the appropriate working concentration at the time of performing the polynucleotide treatment.

A suitable metal (VI) oxo complex solution may have a concentration which is at least 2 fold greater than the working concentration, for example 2 fold to 100 fold greater, preferably about 10 fold greater. For example, the concentrated solution may comprise 0.1 mM to 10M, 1 mM to 1M, or 5 mM to 20 mM metal (VI) oxo complex, preferably about 10 mM.
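The dilution of such a concentrated solution may be sketched as a simple C₁V₁ = C₂V₂ calculation. The 100 μL final volume and the function name are assumptions made for this example; the 10 mM stock and 10 fold dilution correspond to the preferred values given above.

```python
def dilution_volumes(stock_mM, working_mM, final_uL):
    """Return (stock volume, diluent volume) in uL for a C1*V1 = C2*V2 dilution."""
    if working_mM > stock_mM:
        raise ValueError("working concentration exceeds stock concentration")
    stock_uL = final_uL * working_mM / stock_mM
    return stock_uL, final_uL - stock_uL


# 10 mM concentrated solution diluted 10 fold to 1 mM in a hypothetical 100 uL.
stock_uL, diluent_uL = dilution_volumes(stock_mM=10, working_mM=1, final_uL=100)
print(stock_uL, diluent_uL)
```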

Preferably, the concentrated metal (VI) oxo complex solution is alkaline. For example, the solution may have a pH of 8 to 14. A suitable solution may comprise 0.5M to 5M OH⁻. For example, the metal (VI) oxo complex may be dissolved in 0.5M to 5M NaOH or KOH.

The bisulfite reagent may be ammonium bisulfite (NH₄HSO₃) or sodium bisulfite (NaHSO₃). The bisulfite reagent may be in the form of a solution. For example, the kit may comprise a 1M to 10M solution of NH₄HSO₃ or NaHSO₃.

A kit may further comprise a population of control polynucleotides comprising one or more modified cytosine residues, for example 5-methylcytosine (5mC) or 5-hydroxymethylcytosine (5hmC). In some embodiments, the population of control polynucleotides may be divided into one or more portions, each portion comprising a different modified cytosine residue.

A kit for use in identifying modified cytosines may include one or more articles and/or reagents for performance of the method, such as means for providing the test sample itself, including DNA and/or RNA isolation and purification reagents, and sample handling containers (such components generally being sterile).

For example, the kit may further comprise sample preparation reagents for the isolation and extraction of genomic DNA or RNA from a cell. Suitable reagents are well known in the art and include solid-phase anion-exchange chromatography reagents and devices.

A kit may further comprise adaptors or primers for ligation to the termini of the population of polynucleotides following fragmentation. The nature of the adaptors or primers depends on the sequencing method being used. Suitable primers may contain 5mC to prevent the primer sequences from altering during oxidation and bisulfite treatment, or bisulfite treatment alone, as described herein. In some embodiments, the adaptors or primers may comprise a label, such as biotin, to facilitate immobilisation of the polynucleotides.

A kit may further comprise detection oligonucleotides and/or amplification primers for the identification of the residue at a position which corresponds to a cytosine in the sample nucleotide sequence. The oligonucleotides and/or primers may comprise a label, such as biotin, to facilitate detection and/or immobilisation.

The kit may further comprise purification devices and reagents for isolating and/or purifying a portion of polynucleotides, following treatment as described herein. Suitable reagents are well known in the art and include gel filtration columns and washing buffers.

The kit may further comprise amplification reagents for amplification, preferably PCR amplification, of the first, second and/or third portions of the population, following treatment as described herein. Suitable amplification reagents are well known in the art and include oligonucleotide primers, nucleotides, buffers and/or polymerases.

In some embodiments, the kit may further comprise magnetic beads for immobilisation of one or more portions of polynucleotides. The magnetic beads may be coated with a specific binding member, such as streptavidin, for attachment of the polynucleotides.

The kit may include instructions for use in a method of identifying a modified cytosine residue as described above.

A kit may include one or more other reagents required for the method, such as buffer solutions, sequencing and other reagents.

Methods and kits for OxBS sequencing are described in Booth et al (2012) Science 336 934 and PCT/GB2012/051819, which are incorporated herein by reference in their entirety for all purposes.

Various further aspects and embodiments of the present invention will be apparent to those skilled in the art in view of the present disclosure.

All documents mentioned in this specification are incorporated herein by reference in their entirety for all purposes.

“and/or” where used herein is to be taken as specific disclosure of each of the two specified features or components with or without the other. For example “A and/or B” is to be taken as specific disclosure of each of (i) A, (ii) B and (iii) A and B, just as if each is set out individually herein.

Unless context dictates otherwise, the descriptions and definitions of the features set out above are not limited to any particular aspect or embodiment of the invention and apply equally to any one of the aspects and embodiments that are described above.

Certain aspects and embodiments of the invention will now be illustrated by way of example and with reference to the figures described below.

FIG. 1 shows the oxidative bisulfite reaction scheme of the invention: treatment with metal (VI) oxo complex oxidises 5hmC to 5fC and then bisulfite treatment and NaOH convert 5fC to U. The R group is DNA.

FIG. 2 shows a diagram and table outlining the BS-Seq and oxBS-Seq techniques. BS-Seq consists of bisulfite treatment of the input DNA and then amplification followed by sequencing. oxBS-Seq consists of metal (VI) oxo complex treatment of the input DNA, followed by bisulfite treatment and amplification then sequencing. By comparing the input, BS-Seq and oxBS-Seq outputs, C, 5mC and 5hmC can be discriminated, mapped and quantified.

FIG. 3 shows the UV/visible spectrum of 750 μM RuO₄²⁻ in 50 mM NaOH.

FIG. 4 shows an HPLC trace of unoxidised nucleotides A, C, T and G.

FIG. 5 shows an HPLC trace of the unoxidised 3H15BP oligonucleotide which contains three 5hmC residues.

FIG. 6 shows an HPLC trace of the 3H15BP oligonucleotide following oxidation with Mn(VI)O₄²⁻.

FIG. 7 shows an HPLC trace of the 3H15BP oligonucleotide following oxidation with Mo(VI)O₄²⁻.

FIG. 8 shows an HPLC trace of the 3H15BP oligonucleotide following oxidation with Ru(VI)O₄²⁻.

Table 1 shows sequencing outcomes for cytosine and modified cytosines subjected to various treatments.

Table 2 shows the structures of cytosine (1a), 5-methylcytosine (5mC; 1b), 5-hydroxymethylcytosine (5hmC; 1c) and 5-formylcytosine (5fC; 1d).

EXPERIMENTS

Materials

Preparation of Alkaline Aqueous M(VI)O₄²⁻ Solutions

Solid stocks of potassium ferrate, potassium manganate, potassium ruthenate, potassium osmate dihydrate, rhenium oxide and potassium molybdate were obtained from commercial sources at the highest purity possible (Alfa Aesar).

Preparation of Ruthenate (VI)

1. Reduction of Potassium Perruthenate (VII) by Hydroxide.

A 150 mM solution of potassium perruthenate was prepared by dissolving the appropriate mass of KRuO₄ in 500 mM NaOH. Complete dissolution of the solid was achieved by vortexing, affording a solution golden yellow/brown in colour. This solution was incubated at 25° C. for 48 hours. After the incubation period, the solution had turned a deep blood red, accompanied by the evolution of a gas (O₂).

2KRuO₄ + 2NaOH → 2KNaRuO₄ + H₂O + ½O₂

Complete conversion within the 48 hour incubation period was demonstrated by UV/vis spectrophotometry (FIG. 3).
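The "appropriate mass" of KRuO₄ for the 150 mM solution may be worked out as sketched here. The 1 mL preparation volume is an assumption made for this example (no volume is stated above), and the atomic masses are standard rounded values.

```python
# Standard atomic masses in g/mol (rounded).
ATOMIC_MASS = {"K": 39.098, "Ru": 101.07, "O": 15.999}

# Molar mass of potassium perruthenate, KRuO4.
KRUO4_MOLAR_MASS = ATOMIC_MASS["K"] + ATOMIC_MASS["Ru"] + 4 * ATOMIC_MASS["O"]


def mass_mg(concentration_mM, volume_mL, molar_mass):
    """Mass in mg needed for a given concentration (mM) and volume (mL)."""
    micromoles = concentration_mM * volume_mL   # mM x mL = umol
    return micromoles * molar_mass / 1000.0     # umol x g/mol = ug, then to mg


# 150 mM KRuO4 in a hypothetical 1 mL of 500 mM NaOH.
required_mg = mass_mg(150, 1.0, KRUO4_MOLAR_MASS)
print(round(KRUO4_MOLAR_MASS, 2), round(required_mg, 1))
```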

2. Direct Dissolution of Sodium Ruthenate (VI) in Sodium Hydroxide.

Direct preparation of a 150 mM solution of sodium ruthenate (VI) in 500 mM sodium hydroxide was tested. The solid Na₂RuO₄ is particularly insoluble and after more than a week at 25° C., complete dissolution of the solid under these conditions was not achieved. A UV/visible spectrum of the sparingly soluble fraction that did dissolve was taken and shown to be identical to the spectrum of ruthenate in the literature and to that of the ruthenate prepared via Method 1.

Preparation of Manganate (VI)

Decomposition of Potassium Permanganate (VII) by Heating

Heating causes solid potassium permanganate to decompose to manganate with the concomitant release of oxygen, according to the equation below:

2KMnO₄ → K₂MnO₄ + MnO₂ + O₂

Dissolving solid K₂MnO₄ in 500 mM NaOH affords a green solution of K₂MnO₄.

UV/Vis Spectrophotometry

Certain alkaline aqueous solutions of M(VI)O₄²⁻ are highly coloured and hence are easily characterised by UV/vis spectroscopy.

Spectra of alkaline aqueous M(VI)O₄²⁻ solutions were taken using a Cary Varian 100 UV/vis spectrophotometer, in a 1 cm path length quartz glass cuvette (1 mL volume). Typical solution composition was 750 μM M(VI)O₄²⁻ in 50 mM NaOH. Spectra were collected over the range 240-800 nm at a resolution of 1 nm at 25° C. All spectra were subjected to a baseline correction comprising subtraction of a 50 mM NaOH solution blank from each M(VI)O₄²⁻ spectrum.
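The baseline correction may be sketched as a point-by-point subtraction over the 240-800 nm grid at 1 nm resolution. The absorbance values used here are placeholders and not measured data, and the function name is hypothetical.

```python
def baseline_correct(sample_absorbance, blank_absorbance):
    """Subtract the blank spectrum from the sample spectrum point by point."""
    if len(sample_absorbance) != len(blank_absorbance):
        raise ValueError("sample and blank must share the same wavelength grid")
    return [s - b for s, b in zip(sample_absorbance, blank_absorbance)]


# Wavelength grid: 240-800 nm inclusive at 1 nm resolution.
wavelengths_nm = list(range(240, 801))

# Placeholder spectra (flat lines), for illustration only.
sample = [0.5] * len(wavelengths_nm)
blank = [0.25] * len(wavelengths_nm)
corrected = baseline_correct(sample, blank)
print(len(wavelengths_nm), corrected[0])
```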

HPLC Oligonucleotide Oxidation and Digest Assay

A qualitative HPLC assay was used to visualize the oxidative conversion of 5-hmC to 5-fC. Digestion cuts the oligonucleotide into nucleoside monomers that can be uniquely resolved by chromatography, each monomer (A, C, G, T, U, 5-fC, 5-caC and 5-hmC) having a characteristic and predictable retention time under defined analytical conditions. This allows the qualitative evaluation of (e.g.) the oxidation of 5-hmC to 5-fC.

A 100 μM stock of the 15mer oligonucleotide 3H15BP (5′-GAGACGACGTACAGG-3′, where C is 5hmC) was employed. 3H15BP contains three 5-hmC residues.

Oligonucleotide Oxidation

Solutions of 8 μM 3H15BP in 50 mM NaOH were prepared (20.75 μL MilliQ water, 1.25 μL 1M NaOH, 2 μL 100 μM 3H15BP) and mixed by briefly vortexing. These were then incubated at 37° C. for 5 minutes to denature any secondary structure and then snap cooled on iced water (0° C.) for 5 minutes. Oxidation was initiated by the addition to the equilibrated alkaline oligo solution of 2 μL of a 15 mM M(VI)O₄²⁻ solution in 50 mM NaOH. Once added, the oxidation solution was mixed by briefly vortexing and then returned to the iced water for 60 minutes. The oxidation reaction was mixed by vortexing twice during the 60 minute oxidation, after 20 and 40 minutes. After each mix, the reaction was returned to the iced water.
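The concentrations in the reaction may be checked with a short calculation such as the following sketch, which uses only the volumes and stock concentrations listed above. The listed volumes sum to 24 μL before addition of the oxidant, so the computed values come out close to, rather than exactly at, the nominal 8 μM and 50 mM. The function and variable names are hypothetical.

```python
def diluted(stock_concentration, stock_volume_uL, total_volume_uL):
    """Concentration after diluting a stock into a total volume (same units)."""
    return stock_concentration * stock_volume_uL / total_volume_uL


# Alkaline oligonucleotide solution before oxidation.
pre_oxidation_uL = 20.75 + 1.25 + 2.0                    # water + NaOH + oligo
oligo_uM = diluted(100.0, 2.0, pre_oxidation_uL)         # 100 uM stock
naoh_mM = diluted(1000.0, 1.25, pre_oxidation_uL)        # 1 M stock

# After addition of 2 uL of 15 mM oxidant solution.
oxidation_uL = pre_oxidation_uL + 2.0
oxidant_mM = diluted(15.0, 2.0, oxidation_uL)
oligo_in_oxidation_uM = diluted(100.0, 2.0, oxidation_uL)

print(f"pre-oxidation: {oligo_uM:.2f} uM oligo, {naoh_mM:.1f} mM NaOH")
print(f"oxidation: {oxidant_mM:.2f} mM oxidant, {oligo_in_oxidation_uM:.2f} uM oligo")
```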

Oxidized Oligonucleotide Purification

After completion of the 60 minute oxidation, the oxidised oligonucleotide solutions were purified using a pre-washed (4 × 600 μL MilliQ) Roche oligo spin column. Eluate from the column was used as the input for the digestion reaction.

Oligonucleotide Digestion

Oxidized oligonucleotides were digested using the digestion cocktail (22 μL oxidized oligo + 23 μL MilliQ + 5 μL 10× digestion buffer + 0.2 μL digestion cocktail) for 12 hours at 37° C.

Digestion cocktail was made up of 156 U benzonase + 100 U alkaline phosphatase + 0.15 mU phosphodiesterase I.

After digestion, samples were passed through a 3 kDa Amicon filter by centrifugation to remove enzymes from the sample.

HPLC Assay

Digested, oxidised oligonucleotides were analysed by HPLC using an Agilent 1100 HPLC with a flow of 1 mL/min over an Eclipse XDB-C18 3.5 μm, 3.0×150 mm column. The column temperature was maintained at 45° C. Eluting buffers were Buffer A (500 mM Ammonium Acetate (Fisher) pH 5), Buffer B (Acetonitrile) and Buffer C (H₂O). Buffer A was held at 1% throughout the whole run and the gradient for the remaining buffers was 0 min-0.5% B, 2 min-1% B, 8 min-4% B, 10 min-95% B.
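The mobile phase composition at any time in the run may be sketched as follows. The sketch assumes linear interpolation between the stated gradient points, which is usual for gradient HPLC but is not stated above, and takes Buffer C as the remainder after Buffers A and B. The function name is hypothetical.

```python
# Gradient points for Buffer B: (time in min, % B), as listed above.
GRADIENT_B = [(0.0, 0.5), (2.0, 1.0), (8.0, 4.0), (10.0, 95.0)]
BUFFER_A_PERCENT = 1.0  # held constant throughout the run


def composition(time_min):
    """Return % A, % B and % C at a given time, assuming linear segments."""
    if not GRADIENT_B[0][0] <= time_min <= GRADIENT_B[-1][0]:
        raise ValueError("time outside the programmed gradient")
    for (t0, b0), (t1, b1) in zip(GRADIENT_B, GRADIENT_B[1:]):
        if t0 <= time_min <= t1:
            b = b0 + (b1 - b0) * (time_min - t0) / (t1 - t0)
            return {"A": BUFFER_A_PERCENT, "B": b, "C": 100.0 - BUFFER_A_PERCENT - b}
    raise ValueError("no gradient segment found")


print(composition(5.0))
```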

The retention times of 2′-deoxynucleosides are as follows: 2′-deoxy-5-carboxycytidine (1.0 min), 2′-deoxycytidine (1.8 min), 2′-deoxy-5-hydroxymethylcytidine (2.1 min), 2′-deoxyuridine (2.7 min), 2′-deoxy-5-methylcytidine (4.0 min), 2′-deoxyguanosine (4.5 min), 2′-deoxy-5-formylcytidine (5.4 min), 2′-deoxythymidine (5.7 min), 2′-deoxyadenosine (7.4 min).
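Peaks in a trace may be assigned against these retention times as in the following sketch. The 0.1 min matching tolerance and the function name are assumptions of this example; the tolerance is chosen to be smaller than half the closest spacing between listed retention times (0.3 min).

```python
# Retention times (min) of the 2'-deoxynucleosides listed above.
RETENTION_MIN = {
    "2'-deoxy-5-carboxycytidine": 1.0,
    "2'-deoxycytidine": 1.8,
    "2'-deoxy-5-hydroxymethylcytidine": 2.1,
    "2'-deoxyuridine": 2.7,
    "2'-deoxy-5-methylcytidine": 4.0,
    "2'-deoxyguanosine": 4.5,
    "2'-deoxy-5-formylcytidine": 5.4,
    "2'-deoxythymidine": 5.7,
    "2'-deoxyadenosine": 7.4,
}


def assign_peak(observed_min, tolerance_min=0.1):
    """Return the nucleoside nearest to the observed time, or None if too far."""
    name, reference = min(
        RETENTION_MIN.items(), key=lambda item: abs(item[1] - observed_min)
    )
    return name if abs(reference - observed_min) <= tolerance_min else None


print(assign_peak(5.42))
print(assign_peak(3.3))
```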

The HPLC traces obtained following oligonucleotide oxidation with manganate, molybdate and ruthenate are shown in FIGS. 6-8. Ruthenate was found to efficiently oxidise 5hmC to 5fC. Manganate was found to oxidise 5hmC to a lesser extent than ruthenate. No oxidation of 5hmC was observed with molybdate under these conditions.

REFERENCES

-   1. A. M. Deaton et al. Genes Dev. 25, 1010 (May 15, 2011).
-   2. M. Tahiliani et al. Science 324, 930 (May 15, 2009).
-   3. S. Ito et al. Nature 466, 1129 (Aug. 26, 2010).
-   4. A. Szwagierczak et al. Nucleic Acids Res. (Aug. 4, 2010).
-   5. K. P. Koh et al. Cell Stem Cell 8, 200 (Feb. 4, 2011).
-   6. G. Ficz et al. Nature 473, 398 (May 19, 2011).
-   7. K. Williams et al. Nature 473, 343 (May 19, 2011).
-   8. W. A. Pastor et al. Nature 473, 394 (May 19, 2011).
-   9. Y. Xu et al. Mol. Cell 42, 451 (May 20, 2011).
-   10. M. R. Branco et al. Nat. Rev. Genet. 13, 7 (January, 2012).
-   11. S. Kriaucionis et al. Science 324, 929 (May 15, 2009).
-   12. M. Munzel et al. Angew. Chem. Int. Ed. 49, 5375 (July 2010).
-   13. H. Wu et al. Genes Dev. 25, 679 (Apr. 1, 2011).
-   14. S. G. Jin et al. Nuc. Acids. Res. 39, 5015 (July, 2011).
-   15. C. X. Song et al. Nat. Biotechnol. 29, 68 (January, 2011).
-   16. M. Frommer et al. PNAS U.S.A. 89, 1827 (March 1992).
-   17. Y. Huang et al. PLoS One 5, e8888 (2010).
-   18. C. Nestor et al. Biotechniques 48, 317 (April, 2010).
-   19. C. X. Song et al. Nat. Methods (Nov. 20, 2011).
-   20. J. Eid et al. Science 323, 133 (Jan. 2, 2009).
-   21. E. V. Wallace et al. Chem. Comm. 46, 8195 (Nov. 21, 2010).
-   22. M. Wanunu et al. J. Am. Chem. Soc. (Dec. 14, 2010).
-   23. Booth et al. (2012) Science 336 934.
-   24. Wu H et al. Science. 2010; 329(5990):444-448.
-   25. van den Boom D DNA Methylation: Methods and Protocols. Vol. 507, 2nd ed (2008):207-227.
-   26. Howell W M et al. (January 1999) Nat. Biotechnol. 17(1): 87-8.
-   27. Abravaya K et al. (2003) Clin. Chem. Lab. Med. 41 (4):468-74.
-   28. Harbron S et al. (2004) Molecular analysis and genome discovery. London: John Wiley ISBN 0-471-49919-6.
-   29. Newton, C. R. et al. Nucl. Acids Res. 17:2503-2516, 1989.
-   30. Wu, D. Y. et al. PNAS USA, 86:2757-2760, 1989.
-   31. Okayama, H. et al. J. Lab. Clin. Med. 114:105-113, 1989.
-   32. Olivier M (June 2005) Mutat. Res. 573 (1-2): 103-10.
-   33. Gunderson K L (2006) Methods in Enzymology 410: 359-76.
-   34. Syvänen A C (December 2001) Nat. Rev. Genet. 2 (12): 930-42.
-   35. Molecular Diagnostics 2nd Edition (2010) edited by George Patrinos, Wilhelm Ansorge. Elsevier ISBN 978-0-12-374537-8.
-   36. McGuigan F E (2002) Psychiatr. Genet. 12 (3): 133-6.
-   37. Jarvius et al. (2003) Methods in Molecular Biology 212, 215-228.
-   38. Sanger, F. et al. PNAS USA, 1977, 74, 5463.
-   39. Bentley et al. Nature, 456, 53-59 (2008).
-   40. K J McKernan et al. Genome Res. (2009) 19: 1527-1541.
-   41. M Ronaghi et al. Science (1998) 281 5375 363-365.
-   42. Eid et al. Science (2009) 323 5910 133-138.
-   43. Korlach et al. Methods in Enzymology 472 (2010) 431-455.
-   44. Rothberg et al. (2011) Nature 475 348-352.
-   45. Li et al. Nucleic Acids (2011) Article ID 870726.
-   46. Pfaffeneder, T. et al. (2011) Angewandte. 50. 1-6.
-   47. Maeda et al. Hum Immunol. (1990) 27(2):111-21.
-   48. Wang et al. (2008) Electrophoresis April; 29(7):1490-501.
-   49. Wolff et al. (2008) BioTechniques 44 (2): 193-4, 196, 199.
-   50. Zhang et al. (2005) Nucleic Acids Res. 33: W489-92.
-   51. Hung et al. (2008) BMC Biotechnology, 8:62.
-   52. Lister, R. et al. (2008) Cell. 133. 523-536.
-   53. Wang et al. (1980) Nucleic Acids Research. 8 (20), 4777-4790.
-   54. Hayatsu et al. (2004) Nucleic Acids Symposium Series No. 48 (1), 261-262.
-   55. Lister et al. (2009) Nature. 462. 315-22.

TABLE 1

Base    Regular Sequencing    Bisulfite Sequencing    Oxidation then Bisulfite Sequencing
C       C                     U                       U
5mC     C                     C                       C
5hmC    C                     C                       U
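The readouts in Table 1 can be treated as a lookup from the pair of observed residues (bisulfite sequencing, oxidation then bisulfite sequencing) back to the identity of the base in the sample sequence. The following is a minimal illustrative sketch only, assuming error-free reads at a position that reads as C in regular sequencing; the names `TABLE_1` and `infer_modification` are hypothetical and are not part of the described method.

```python
# Readouts copied from Table 1. "U" is kept as in the table; in practice
# uracil is typically read as thymine after amplification (assumption).
TABLE_1 = {
    "C":    {"regular": "C", "bisulfite": "U", "ox_bisulfite": "U"},
    "5mC":  {"regular": "C", "bisulfite": "C", "ox_bisulfite": "C"},
    "5hmC": {"regular": "C", "bisulfite": "C", "ox_bisulfite": "U"},
}


def infer_modification(bisulfite, ox_bisulfite):
    """Return the base from Table 1 that matches both readouts.

    Returns None when the pair of readouts matches no row of Table 1
    (for example "U" after bisulfite but "C" after oxidation then bisulfite).
    """
    matches = [
        base
        for base, readout in TABLE_1.items()
        if readout["bisulfite"] == bisulfite
        and readout["ox_bisulfite"] == ox_bisulfite
    ]
    return matches[0] if len(matches) == 1 else None


if __name__ == "__main__":
    for pair in [("U", "U"), ("C", "C"), ("C", "U"), ("U", "C")]:
        print(pair, "->", infer_modification(*pair))
```

A position that reads C in both treated portions is thus assigned 5mC, while a position that reads C after bisulfite alone but U after oxidation then bisulfite is assigned 5hmC.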

TABLE 2: panels a), b), c) and d)

What is claimed is:
1.-45. (canceled)
46. A method comprising: selectively converting 5-hydroxymethylcytosine (5-hmC) to 5-formylcytosine (5-fC) in a nucleotide sequence by contacting at least a portion of said nucleotide sequence with a metal (VI) oxo complex.
47. The method of claim 46, further comprising treating said nucleotide sequence with bisulfite.
48. The method of claim 47, further comprising sequencing said nucleotide sequence.
49. The method of claim 47, further comprising amplifying said nucleotide sequence.
50. The method of claim 47, wherein said method identifies a presence of 5-methylated cytosine (5-mC) or 5-hmC in said nucleotide sequence.
51. The method of claim 46, wherein said metal (VI) oxo complex is manganate (Mn(VI)O₄²⁻).
52. The method of claim 46, wherein said metal (VI) oxo complex is ruthenate (Ru(VI)O₄²⁻).
53. The method of claim 52, wherein said metal (VI) oxo complex is K₂RuO₄.
54. The method of claim 46, wherein said contacting is repeated.
55. The method of claim 46, wherein said nucleotide sequence comprises DNA.
56. The method of claim 55, wherein said DNA is genomic DNA.
57. The method of claim 46, wherein said nucleotide sequence comprises RNA.
58. The method of claim 46, wherein said nucleotide sequence or said portion of said nucleotide sequence is immobilized.
59. The method of claim 46, further comprising contacting said nucleotide sequence with a detection oligonucleotide.
60. The method of claim 59, wherein said detection oligonucleotide hybridizes to portions of said nucleotide sequence having a cytosine residue or a uracil residue.
61. The method of claim 59, wherein said detection oligonucleotide is immobilized.
62. The method of claim 46, wherein said nucleotide sequence is a mammalian nucleotide sequence.
63. The method of claim 46, wherein said nucleotide sequence is from a tissue sample.
64. The method of claim 46, further comprising converting a portion of said 5-hydroxymethylcytosine (5-hmC) to 5-carboxymethylcytosine (5-caC).